Cost-Sensitive Learning for Emotion Robust Speaker Recognition
Abstract
In information security, voice is one of the most important biometric modalities. With the growth of voice communication over the Internet and telephone systems, large volumes of voice data have become accessible. In speaker recognition, a voiceprint can serve as a unique password with which users prove their identity. However, emotional speech can cause an unacceptably high error rate and degrade the performance of a speaker recognition system. This paper addresses the problem by introducing a cost-sensitive learning technique that reweights the probability of affective test utterances at the pitch envelope level, which effectively enhances robustness in emotion-dependent speaker recognition. Based on that technique, a new recognition system architecture and its components are proposed. Experiments on the Mandarin Affective Speech Corpus show an 8% improvement in identification rate over traditional speaker recognition.
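The abstract does not give the reweighting formula, so the following is only an illustrative sketch of the general idea of cost-sensitive score reweighting, not the paper's method. It assumes an utterance is split into segments, that each segment has a mean pitch and a per-speaker log-likelihood, and that segments whose pitch deviates strongly from a neutral reference are down-weighted. The function names, the `tolerance` threshold, and the `reduced_weight` value are all hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def segment_weights(
    pitch_means: Sequence[float],
    neutral_pitch: float,
    tolerance: float = 0.2,       # hypothetical: allowed relative pitch deviation
    reduced_weight: float = 0.3,  # hypothetical: weight for strongly deviating segments
) -> List[float]:
    """Return one weight per segment: 1.0 if the segment's mean pitch is
    within `tolerance` of the neutral reference, else `reduced_weight`."""
    weights = []
    for pitch in pitch_means:
        relative_deviation = abs(pitch - neutral_pitch) / neutral_pitch
        weights.append(1.0 if relative_deviation <= tolerance else reduced_weight)
    return weights


def identify(
    loglik: Dict[str, Sequence[float]],
    weights: Sequence[float],
) -> Tuple[str, Dict[str, float]]:
    """Score each speaker by the weighted sum of per-segment log-likelihoods
    and return the best-scoring speaker together with all scores."""
    scores = {
        speaker: sum(w * ll for w, ll in zip(weights, segment_lls))
        for speaker, segment_lls in loglik.items()
    }
    best = max(scores, key=lambda speaker: scores[speaker])
    return best, scores


# Toy data (invented for illustration): the third segment is strongly emotional.
loglik = {"alice": [-10.0, -12.0, -30.0], "bob": [-12.0, -14.0, -20.0]}
weights = segment_weights([200.0, 210.0, 320.0], neutral_pitch=200.0)
best_weighted, _ = identify(loglik, weights)
best_unweighted, _ = identify(loglik, [1.0, 1.0, 1.0])
```

In this toy case the unweighted decision is driven by the high-pitch third segment, while the weighted decision relies mostly on the two near-neutral segments, so the two can disagree.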
Similar resources
Recognizing and regulating e-learners' emotions based on interactive Chinese texts in e-learning systems
Emotional illiteracy exists in current e-learning environments, where it dampens learning enthusiasm and productivity, and it has received growing attention in recent research. Inspired by affective computing and the active listening strategy, this paper first presents a research and application framework for recognizing emotion based on textual interaction. Second, an emotion category model for e-l...
Reusing Neural Speech Representations for Auditory Emotion Recognition
Acoustic emotion recognition aims to categorize the affective state of the speaker and is still a difficult task for machine learning models. The difficulties come from the scarcity of training data, general subjectivity in emotion perception resulting in low annotator agreement, and the uncertainty about which features are the most relevant and robust ones for classification. In this paper, we...
Real-Time Tracking of Speakers' Emotions, States, and Traits on Mobile Platforms
We demonstrate audEERING’s sensAI technology running natively on low-resource mobile devices applied to emotion analytics and speaker characterisation tasks. A show-case application for the Android platform is provided, where audEERING’s highly noise robust voice activity detection based on LSTM-RNN is combined with our core emotion recognition and speaker characterisation engine natively on th...
Speaker-independent emotion recognition based on feature vector classification
This paper proposes a new feature vector classification for speech emotion recognition. The conventional feature vector classification applied to speaker identification categorized feature vectors as overlapped and non-overlapped. This method discards all of the overlapped vectors in model training, while non-overlapped vectors are used to reconstruct corresponding speaker models. Although the ...
Extended weighted linear prediction using the autocorrelation snapshot - a robust speech analysis method and its application to recognition of vocal emotions
Temporally weighted linear predictive methods have recently been successfully used for robust feature extraction in speech and speaker recognition. This paper introduces their general formulation, where various efficient temporal weighting functions can be included in the optimization of the all-pole coefficients of a linear predictive model. Temporal weighting is imposed by multiplying element...
Journal title:
Volume 2014, issue
Pages -
Publication date: 2014